Complex question answering (CQA) is used for human knowledge answering and community question answering. A CQA system is essential to overcome the complexities present in question answering systems. Existing techniques ignore the structure of queries, which results in a significant number of noisy queries. Complex queries, distributed knowledge, composite approaches, templates, and ambiguity are the common challenges faced by CQA. To solve these issues, this paper presents a new manta ray foraging optimized deep contextualized bidirectional long short-term memory based adaptive galactic swarm optimization (MDCBiLSTM-AGSO) for CQA. First, the input question is preprocessed and a similarity assessment is performed to eliminate misclassification. Afterwards, the extracted keywords are mapped to candidate results to improve answer selection. Next, a new similarity approach named InfoSelectivity is introduced for semantic similarity evaluation based on the closeness among elements. Then, the relevant answers are classified through the MDCBiLSTM, which is optimized by manta ray foraging optimization (MRFO). Finally, adaptive galactic swarm optimization (AGSO) selects the best output. The proposed scheme is implemented on the Java platform and achieves better results than the existing approaches, with an average accuracy of 98.2%.

1. INTRODUCTION


Question answering (QA) is the process of automatically finding answers to questions posed in natural language. A QA framework requires the capability of understanding natural language linguistics, text, and common knowledge. It is also a key application in information retrieval, natural language processing, and machine learning [1]. Over knowledge graphs, templates play an important role in QA because they are used for mapping and are leveraged to retrieve specific answers [2], [3]. Benchmarking plays a significant part in enhancing scalable question answering (SQA) frameworks. Although a number of evaluation schemes exist, essentially only the final answers per input natural language question (NLQ) are evaluated [4]. In an NLQ, the question is posed in a natural language such as English. QA systems are also described as a powerful platform for answering questions automatically [5].

Querying distributed knowledge, disambiguation, question analysis, query construction, and phrase mapping are the tasks analyzed in information retrieval systems [6], [7]. Because of the exponential growth of online information, text summarization has become increasingly necessary for applications such as question answering, report generation, and name recommendations [8]. A substantial fraction of online information is time-dependent. Temporal conditions therefore have to be tested when computing answers, even when the search request does not mention events or dates [9]. In linked data challenges, an SQA system needs to find a way to translate between a natural language question and a query over the knowledge base. Questions range from simple ones, which ask for a single fact about a single object, to complex ones [10].

In a QA pipeline, a modern QA system needs to incorporate several components, each specialized for a particular task. A greedy algorithm is used to build a QA pipeline from the best-performing components for the question [11]. Compared with neural models, this provides lower performance because the models perform less training. Recently, neural network models have gained ground, and the recurrent neural network model is widely used [12]. Responding to questions and ranking the candidate answers are the problems that occur in these applications [13]. In visual question answering, the question to be answered is not known until run time, which makes it different from several other problems in computer vision. Visual question answering is therefore a harder problem than visual data captioning, since it involves obtaining information not present in the visual data [14]. The multimodal hybrid feature (MHF) extraction scheme attains a more discriminative visual-question representation with the help of more complex high-order interactions between multimodal features. Moreover, its results provide a substantial improvement in visual question answering (VQA) performance [15]. VQA is a research area concerned with building a computer system that answers questions posed about an image in natural language. To overcome the above issues, researchers have introduced modular QA frameworks with reusable components, named open knowledge base and question answering (OKBQA) [16].

The system analyzes the query layer by layer, and its constituents are matched against the knowledge base (KB) schema. The use of KBs simplifies the problem by separating the issue of information collection and organization from that of searching through it. Testing grammatical question answering (GQA) on the SQA2018 training set showed high precision for both complex and simple questions [17]. Structured query languages are highly expressive by comparison, but they are quite complicated for users [18], [19]. In real time, performing the task is very difficult because of the improvement in search policy and the computation of user satisfaction. Hence, a reinforcement learning formulation was introduced for complex question answering, which improves accuracy based on the user's information need [20].

The foremost contributions of this research paper are as follows:

- This paper presents a new manta ray optimized deep contextualized bidirectional long short-term memory (MDCBiLSTM) with adaptive galactic swarm optimization (AGSO) framework for complex question answering (CQA), because better answer selection and template-based methods are required for an efficient question answering system.

- To identify the matching quality of answers for a given question, the new InfoSelectivity approach is introduced. It uses the deviation in the quantity of information between the distributions, combined with a standard metric distance.

- To perform accurate complex question answering, a manta ray optimized DCBiLSTM is introduced for answer selection. Deep learning-based approaches have achieved advanced performance in a wide range of natural language processing (NLP) tasks. To advance this work, a neural network architecture is introduced that is capable of learning to attend to the important information in a sentence by presenting a contextual term. Here, manta ray foraging optimization (MRFO) is introduced to assign a weight to every position at a lower level of the neural network when evaluating a higher-level representation.

- To select the best quality answer, the MDCBiLSTM model is used. Moreover, AGSO is used for weight updates to rank the appropriate answer.

The remainder of this paper is organized as follows: section 2 discusses recent research works related to CQA. Section 3 explains the proposed method in detail. Section 4 presents the results and discussion, and section 5 concludes the paper.


2. LITERATURE REVIEW 

Qiu et al. [21] presented a director-actor-critic scheme to address the challenges of CQA. Query graph creation was formulated as a hierarchical decision problem through options over a Markov decision process. The director determined the elements required in the query graph, the corresponding triples were generated by selecting edges, and the critic evaluated them against the given question. Furthermore, a hierarchical reinforcement learning scheme with intrinsic motivation was introduced for training from weak supervision. To speed up the training process, the critic was pre-trained with high-reward trajectories created by hand-crafted rules. Curriculum learning was also leveraged, gradually increasing the complexity of the questions used for query graph generation. Extensive experiments on widely used benchmark datasets demonstrated the effectiveness of the proposed framework.

Chen et al. [22] introduced a new formal query building scheme that includes two stages. First, the query structure of the question was predicted, and this structure was leveraged to guide the generation of candidate queries. A graph generation framework handled the structure prediction task; in each generative step, an encoder-decoder model was designed to predict the argument of a predetermined operation. Second, existing schemes were used to rank the candidate queries. Experimental results showed that the introduced formal query generation approach outperformed other approaches on complex questions.

Maheshwari et al. [23] presented an empirical examination of neural query graph ranking schemes for CQA on knowledge graphs. A self-attention based slot matching design was proposed along with six different ranking models, exploiting the inherent structure of query graphs. The introduced scheme outperformed previous schemes on two QA datasets over the DBpedia knowledge graph, evaluated in different settings. In addition, transfer learning from the larger of these QA datasets to the smaller one was shown to produce substantial improvements, effectively offsetting the general lack of training data.

Reddy and Madhavi [24] proposed a template representation based convolutional recurrent neural network (T-CRNN) for answer selection in a CQA scheme. A recurrent neural network (RNN) was used to capture the correlation between the questions and the answers, as well as the semantic similarity among the collected answers. A convolutional neural network (CNN) learned the representations of the questions and the answers individually. With the softmax classifier, the correctly correlated answers were recognized for the given question.

Reddy and Madhavi [25] proposed a hierarchy based firefly optimized k-means clustering (HFO-KC) design for CQA. By comparing strings, it eliminates misclassification. The answer selection process was improved by mapping the keywords to the candidate solutions, and the keywords were segmented after the mapping process. For document retrieval, the Okapi BM25 similarity computation approach was used. The selected answers were classified with k-means clustering, which creates a hierarchy for every answer. Finally, the best quality answer was selected from the hierarchy by the firefly optimization algorithm.

Esposito et al. [26] presented a hybrid query expansion approach with cosine similarity based embeddings for text retrieval in question answering. Synonyms of the terms were first extracted from MultiWordNet and then contextualized to the collection of documents used in the QA framework. Finally, the resulting set was ordered and filtered according to the words and the sense of the question through cosine similarity, and the answers were formulated. The presented technique generated accurate answers to the questions used to assess its effectiveness.

Spatial or temporal templates, the lexical gap, distributed knowledge, procedural questions, complex operators, multilingualism, and ambiguity are the challenges addressed in question answering systems. The question classification problem can be defined as “categorizing questions into various semantic classes based on possible semantic type of answers”. It helps to place a constraint on what constitutes relevant data for the answer and provides significant information about the nature of the answer. Converting the user's information need into a form that can be evaluated is also a problem in question answering. To overcome these issues, a new method is proposed in this research.


3. PROPOSED METHOD 

The CQA system is introduced to answer several kinds of questions precisely. CQA is a type of information retrieval in which the query is given as a natural language question and the reply is returned as a natural language answer. In this research, a deep contextualized BiLSTM based AGSO approach is proposed for CQA. First, the question is preprocessed and the keywords are extracted. The extracted keywords are then mapped to candidate solutions. After that, the correct answer is retrieved from the question-answer data set using the InfoSelectivity computation: a large number of answers are picked for the given query based on their similarity scores. DCBi-LSTM with MRFO is used to pick the appropriate answers for the complex question. From the appropriate answers, the optimized result is obtained with AGSO. The schematic diagram of the proposed MDCBiLSTM-AGSO approach for CQA is depicted in Figure 1.


3.1. Pre-processing and segmentation 


Initially, the question and its candidate answers from the raw document are pre-processed using stop word removal, word extraction, stemming, and similarity assessment. These are important steps in the pre-processing of textual information. Based on the words that exist in the user opinion feedback reviews, the stop words are first eliminated and stemming is then performed. After stop word removal and stemming, the exact keywords are extracted. Afterwards, templates are created for each keyword. Next, the created templates are combined to form a segment, and a number of answers are selected for each segment. The pre-processing steps in question answering are depicted in Figure 2.
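The stop word removal, stemming, and keyword extraction steps described above can be sketched as follows. This is a minimal illustration, not the paper's Java implementation: the stop word list, the suffix-stripping rule, and the function names `simple_stem` and `extract_keywords` are assumptions introduced here, and a production system would more likely rely on a full stop word list and an established stemmer.

```python
import re

# Assumption: a tiny illustrative stop word list, not the one used in the paper.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "in", "on",
              "to", "for", "and", "or", "what", "which", "who", "how", "did", "does"}

# Assumption: crude suffix stripping stands in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def simple_stem(word):
    """Strip the first matching suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text):
    """Lowercase and tokenize the text, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [simple_stem(token) for token in tokens if token not in STOP_WORDS]
```

For example, `extract_keywords("Which rivers are crossing the cities?")` drops "which", "are", and "the" and stems the remaining three words.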
Figure 1. Flow of proposed approach for CQA
Figure 2. Flow of pre-processing and segmentation
Manta ray optimized deep contextualized bi-directional ... (Ankireddypalli Chandra Obula Reddy) 


3998 O ISSN: 2088-8708 


3.2. New InfoSelectivity based similarity calculation for answer selection 

Similarity evaluation is of great interest in many research areas such as information retrieval, text analysis, pattern recognition, machine learning, and data science. In the possibility framework, uncertain data are represented by possibility distributions, so similarity methods are used to compute the similarity between two possibility distributions.

The new similarity measure is calculated from two measures: a selectivity measure and a distance measure. The selectivity measure quantifies the variation in the amount of information, while the distance measure compares the information accuracy on a point-to-point basis. A standard normalized Manhattan distance is used here to compare the distributions point to point. In addition, the data are exploited according to the ranking of the possibility degrees belonging to the distribution, as stated by the minimum selectivity principle. The selectivity relation among the possibility distributions is combined to qualify the local relation between the compared possibility distributions. The new selectivity measure is computed by (1):


$$Se(\pi) = \pi(\omega_1) - \sum_{u=2}^{n} wgt_u \, \pi(\omega_u) \in [0,1], \quad wgt_u \in [0,1] \tag{1}$$


where $wgt_u$ represents the weight, $\sum_{u=2}^{n} wgt_u = 1$, and the possibility degrees $\pi(\omega_u)$ are taken in ranked order. Without a requirement for normalization, the measure falls in the unit interval [0, 1] and it provides the key properties of a possibilistic similarity measure.
A new similarity degree is defined with two weights, $u$ and $v$, which are used to balance the distance measure and the selectivity measure. InfoSelectivity: let $\pi_1$ and $\pi_2$ be two possibility distributions. InfoSelectivity is defined as the information specificity measure given by

$$InfoSelectivity(\pi_1, \pi_2) = \begin{cases} 1 - \dfrac{u \, D_{Manh}(\pi_1, \pi_2) + v \, D_{Se}(\pi_1, \pi_2)}{u + v}, & \text{if } D_{Manh} \neq 1 \\ 0, & \text{if } D_{Manh} = 1 \end{cases} \tag{2}$$

where $D_{Se}(\pi_1, \pi_2) = |Se(\pi_1) - Se(\pi_2)|$, $u \in [0,1]$, and $v \in [0,1]$. Here, $u$ and $v$ are the two coefficients; $D_{Se}$ represents the difference in selectivity between the two possibility distributions, and $D_{Manh}$ represents the normalized Manhattan distance between the compared distributions.
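The InfoSelectivity computation can be sketched as below, under stated assumptions: each possibility distribution is a list of degrees in [0, 1] over the same elements, the normalized Manhattan distance is taken as the mean absolute point-to-point difference, the selectivity weights default to uniform values that sum to 1, and the similarity is read as one minus the weighted combination of the two distances. The function names are hypothetical and not taken from the paper.

```python
def selectivity(pi, weights=None):
    """Se(pi): top possibility degree minus a weighted sum of the remaining ranked degrees."""
    ranked = sorted(pi, reverse=True)
    rest = ranked[1:]
    if not rest:
        return ranked[0]
    if weights is None:
        # Assumption: uniform weights wgt_u that sum to 1.
        weights = [1.0 / len(rest)] * len(rest)
    return ranked[0] - sum(w * p for w, p in zip(weights, rest))


def manhattan_normalized(pi1, pi2):
    """Mean absolute point-to-point difference; lies in [0, 1] for degrees in [0, 1]."""
    return sum(abs(a - b) for a, b in zip(pi1, pi2)) / len(pi1)


def info_selectivity(pi1, pi2, u=0.5, v=0.5):
    """Balance the Manhattan distance and the selectivity difference with weights u and v."""
    d_manh = manhattan_normalized(pi1, pi2)
    if d_manh == 1.0:
        return 0.0
    d_se = abs(selectivity(pi1) - selectivity(pi2))
    return 1.0 - (u * d_manh + v * d_se) / (u + v)
```

Identical distributions score 1, while two distributions that differ by 1 at every point score 0. The defaults `u = v = 0.5` are illustrative only.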


3.3. DCBi-LSTM with MRFO based answer classification 

Once the similarity has been verified, the resulting score is passed to the hybrid DCBi-LSTM-MRFO. The hybrid DCBi-LSTM-MRFO framework provides accurate answers for complex questions. The introduced model includes four units: word depiction, phrase depiction, phrase classification, and optimization using MRFO.


3.3.1. Phase of word depiction 

Let $M = [m_1, m_2, \ldots, m_i, \ldots, m_n] \in \mathbb{R}^{n \times (D_{w2v} + D_{elmo})}$ be the input to the model, defined as the informal phrase, where $D_{elmo}$ and $D_{w2v}$ represent the dimensions of the embeddings from language models (ELMo) and of word2vec, and $n$ represents the phrase length. Here, $m_i$ represents the embedding of the $i$-th word in the phrase, obtained by concatenating the word2vec and ELMo embeddings as in (3),


$$M = E_{w2v} \oplus E_{elmo} \tag{3}$$


where $\oplus$ represents the row-wise concatenation; $E_{elmo} = [\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n] \in \mathbb{R}^{n \times D_{elmo}}$ represents the ELMo embedding matrix, in which row $\hat{e}_i$ is the ELMo embedding of the $i$-th word; and $E_{w2v} = [e_1, e_2, \ldots, e_n] \in \mathbb{R}^{n \times D_{w2v}}$ represents the word2vec embedding matrix, in which each row $e_i$ denotes the word2vec embedding of the $i$-th word.


3.3.2. Phrase depiction 

From the input phrase matrix $M$, this module learns a fixed-length phrase vector representation. Long-term dependencies are effectively modelled by the bidirectional long short-term memory (BiLSTM); in particular, the BiLSTM captures contextual information by encoding the words in both the forward ($\overrightarrow{h_i}$) and backward ($\overleftarrow{h_i}$) directions:


$$h_i = \overrightarrow{h_i} \oplus \overleftarrow{h_i} \in \mathbb{R}^{2d}, \quad i = 1, 2, 3, \ldots, n \tag{4}$$

where $\overleftarrow{h_i} = BLSTM(w_i) \in \mathbb{R}^{d}$ and $\overrightarrow{h_i} = FLSTM(w_i) \in \mathbb{R}^{d}$, with $d$ denoting the hidden size of each direction; BLSTM and FLSTM represent the backward and forward directional processing of words in the LSTM, respectively; and $h_n$ denotes the hidden state obtained after processing the final word. To enrich the phrase representation, $h_n$ is concatenated with a vector obtained by averaging the ELMo embeddings of all words in the phrase, i.e.:


$$q = h_n \oplus \frac{1}{n} \sum_{i=1}^{n} \hat{e}_i \tag{5}$$

where $\hat{e}_i$ denotes the ELMo embedding of the $i$-th word in the phrase.
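The two concatenation steps, the word-level input in (3) and the phrase vector in (5), can be sketched with plain Python lists. The sketch assumes the embeddings and the final BiLSTM hidden state are already available as lists of floats; the function names are hypothetical, and the BiLSTM itself is deliberately left out.

```python
def concat_rows(e_w2v, e_elmo):
    """Row-wise concatenation as in (3): row i is e_w2v[i] followed by e_elmo[i]."""
    if len(e_w2v) != len(e_elmo):
        raise ValueError("both matrices need one row per word")
    return [list(w) + list(c) for w, c in zip(e_w2v, e_elmo)]


def mean_vector(rows):
    """Column-wise average of a list of equal-length vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def phrase_vector(h_n, e_elmo):
    """Phrase vector as in (5): final hidden state followed by the averaged ELMo embedding."""
    return list(h_n) + mean_vector(e_elmo)
```

The resulting phrase vector has the length of the hidden state plus the ELMo dimension, which is the input size expected by the classification layer that follows.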


3.3.3. Phrase classification 

Using two activation functions, ReLU and softmax, this module maps the phrase representation vector $q$ to the standard concepts $c_1, c_2, \ldots, c_K$, where $K$ indicates the total number of unique concepts in the data set. The output vector $\hat{y}$ is evaluated by (6) and (7):


$$r = ReLU(Wgt_1 q + B_1) \tag{6}$$
$$\hat{y} = softmax(Wgt_2 r + B_2) \tag{7}$$


where $B_1$, $B_2$, $Wgt_1$, and $Wgt_2$ represent the learnable parameters, i.e. the biases and weights. The predicted vector $\hat{y}$ is a probability vector of length $K$, in which each element $\hat{y}_i \in [0,1]$ denotes the probability that the phrase is assigned to the concept $c_i$. The ground truth vector $y$ is one-hot encoded. Then, the weights and biases of the network are updated by the MRFO scheme.
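The forward pass in (6) and (7) can be illustrated with a small dependency-free sketch. The weight matrices are given as lists of rows and the function names are hypothetical; parameter training, which the text assigns to MRFO, is not shown here.

```python
import math


def relu(vector):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in vector]


def softmax(vector):
    """Numerically stable softmax: shift by the maximum before exponentiating."""
    shift = max(vector)
    exps = [math.exp(x - shift) for x in vector]
    total = sum(exps)
    return [e / total for e in exps]


def affine(weight, vector, bias):
    """Compute weight @ vector + bias for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weight, bias)]


def classify(q, wgt1, b1, wgt2, b2):
    """Eq. (6) followed by eq. (7): returns a probability vector over the K concepts."""
    r = relu(affine(wgt1, q, b1))
    return softmax(affine(wgt2, r, b2))
```

The index of the largest entry of the returned vector is the predicted concept, and the entries always sum to 1.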

First, the population of manta rays is initialized. After that, the foraging strategies described below are applied to the entire population. The chain foraging strategy is applied first to each individual by (8):

$$\chi_i^{D}(T+1) = \begin{cases} \chi_i^{D}(T) + rand \, (\chi_{best}^{D} - \chi_i^{D}(T)) + wgt \, (\chi_{best}^{D} - \chi_i^{D}(T)), & i = 1 \\ \chi_i^{D}(T) + rand \, (\chi_{i-1}^{D}(T) - \chi_i^{D}(T)) + wgt \, (\chi_{best}^{D} - \chi_i^{D}(T)), & i \neq 1 \end{cases} \tag{8}$$

where $\chi_{best}$ represents the best individual; $wgt$ represents the weight coefficient, with $wgt = 2 \, rand \sqrt{|\log(rand)|}$; $rand$ indicates a random number in the range [0, 1]; and $\chi_i^{D}(T)$ represents the position of the $i$-th individual at iteration $T$.

Cyclone foraging is the second foraging strategy; it models the spiral-shaped movement of the manta rays toward the target. The mathematical formulation of cyclone foraging is given by (9):

$$\chi_i^{D}(T+1) = \begin{cases} \chi_{best}^{D} + rand_1 (\chi_{best}^{D} - \chi_i^{D}(T)) + wgt_1 (\chi_{best}^{D} - \chi_i^{D}(T)), & i = 1 \\ \chi_{best}^{D} + rand_1 (\chi_{i-1}^{D}(T) - \chi_i^{D}(T)) + wgt_1 (\chi_{best}^{D} - \chi_i^{D}(T)), & i \neq 1 \end{cases} \tag{9}$$

where $wgt_1$ denotes the weight coefficient, $wgt_1 = 2 e^{rand_1 \frac{T_{max} - T + 1}{T_{max}}} \sin(2\pi \, rand_1)$, and $T_{max}$ represents the maximum number of iterations.

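In addition, every meta-heuristic algorithm needs good exploration ability, so that the search covers the entire search space rather than only the neighborhood of the current best solution. The mathematical form of the exploration step is expressed by (10):

$$\chi_i^{D}(T+1) = \begin{cases} \chi_{rand}^{D} + rand_1 (\chi_{rand}^{D} - \chi_i^{D}(T)) + wgt_1 (\chi_{rand}^{D} - \chi_i^{D}(T)), & i = 1 \\ \chi_{rand}^{D} + rand_1 (\chi_{i-1}^{D}(T) - \chi_i^{D}(T)) + wgt_1 (\chi_{rand}^{D} - \chi_i^{D}(T)), & i \neq 1 \end{cases} \tag{10}$$

Here, $\chi_{rand}^{D} = rand_1 (b_u^{D} - b_l^{D}) + b_l^{D}$, where $b_u^{D}$ and $b_l^{D}$ denote the upper and lower bounds of the search space.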
The third and last foraging stage is the somersault, which is applied to obtain the final location of all individuals and is expressed by (11):

$$\chi_i^{D}(T+1) = \chi_i^{D}(T) + \delta \, (rand_2 \, \chi_{best}^{D} - rand_3 \, \chi_i^{D}(T)) \tag{11}$$

where $\delta$ represents the somersault factor, and $rand_2$ and $rand_3$ represent two independent random numbers. After the somersault stage, the algorithm increments $T$ by 1 and starts the next iteration with the updated positions. At the end, the best individual $\chi_{best}$, i.e. the individual with the lowest fitness value, is the optimal solution to the optimization problem.
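The three foraging stages can be combined into a compact optimization loop. The sketch below is illustrative only and makes several assumptions that the text does not state: a coin flip chooses between chain and cyclone foraging, the exploration variant is used when the iteration fraction is below a random draw, the somersault factor defaults to 2, positions are clipped to the bounds, one random number is drawn per individual rather than per dimension, and `fitness` is any function to minimize (for the paper's use it would score a candidate set of network weights). The name `mrfo_minimize` is hypothetical.

```python
import math
import random


def mrfo_minimize(fitness, dim, lower, upper, pop_size=20, max_iter=100,
                  delta=2.0, seed=0):
    """Minimize `fitness` over [lower, upper]^dim with a basic MRFO loop."""
    rng = random.Random(seed)

    def clip(value):
        return min(max(value, lower), upper)

    pop = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(pop_size)]
    best = list(min(pop, key=fitness))

    for t in range(1, max_iter + 1):
        new_pop = []
        for i, x in enumerate(pop):
            if rng.random() < 0.5:
                # Cyclone foraging: (9) around the best, (10) around a random point.
                r1 = rng.random()
                wgt1 = (2.0 * math.exp(r1 * (max_iter - t + 1) / max_iter)
                        * math.sin(2.0 * math.pi * r1))
                if t / max_iter < rng.random():
                    ref = [rng.uniform(lower, upper) for _ in range(dim)]
                else:
                    ref = best
                lead = ref if i == 0 else pop[i - 1]
                cand = [ref[d] + r1 * (lead[d] - x[d]) + wgt1 * (ref[d] - x[d])
                        for d in range(dim)]
            else:
                # Chain foraging, (8); r lies in (0, 1] so log(r) is defined.
                r = 1.0 - rng.random()
                wgt = 2.0 * r * math.sqrt(abs(math.log(r)))
                lead = best if i == 0 else pop[i - 1]
                cand = [x[d] + r * (lead[d] - x[d]) + wgt * (best[d] - x[d])
                        for d in range(dim)]
            new_pop.append([clip(v) for v in cand])
        pop = new_pop
        best = list(min(pop + [best], key=fitness))

        # Somersault foraging, (11), applied to every individual.
        somersaulted = []
        for x in pop:
            r2, r3 = rng.random(), rng.random()
            somersaulted.append([clip(x[d] + delta * (r2 * best[d] - r3 * x[d]))
                                 for d in range(dim)])
        pop = somersaulted
        best = list(min(pop + [best], key=fitness))

    return best, fitness(best)
```

A call such as `mrfo_minimize(lambda x: sum(v * v for v in x), dim=2, lower=-5.0, upper=5.0)` returns the best position found and its fitness. Because the best individual is carried over between stages, the returned fitness never exceeds that of the best member of the initial population.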


3.4. AGSO based answer selection 

Galactic swarm optimization (GSO) [27] imitates the movement of galaxies and of the stars within galaxies. In the AGSO algorithm, the best solution is found using individual subpopulations. Initially, the individuals in each subpopulation are attracted to better solutions according to particle swarm optimization (PSO). Next, every subpopulation is represented by the best solution it has found, and these representatives are treated as a super swarm.

Here, the swarm is represented by a set $\theta$ of elements $e_y^{(\chi)}$, consisting of $X$ partitions known as sub-swarms $\epsilon_\chi$, each of size $n$. All elements of the set $\theta$ are initialized in the search space $[e_{min}, e_{max}]^{D}$. The GSO optimization algorithm has two stages.

— Stage 1 

Each sub-swarm explores the search space independently. This process starts by computing the velocity and position of the particles:

$$v_y^{(\chi)} \leftarrow w_1 v_y^{(\chi)} + \alpha_1 rand_1 (b_y^{(\chi)} - e_y^{(\chi)}) + \alpha_2 rand_2 (g^{(\chi)} - e_y^{(\chi)}) \tag{12}$$
$$e_y^{(\chi)} \leftarrow e_y^{(\chi)} + v_y^{(\chi)} \tag{13}$$

where $v_y^{(\chi)}$ denotes the velocity of the particle, $b_y^{(\chi)}$ represents the best solution found by the particle so far, $g^{(\chi)}$ denotes the global best solution of the sub-swarm, $e_y^{(\chi)}$ represents the current particle position, and $\alpha_1$ and $\alpha_2$ are the acceleration constants, which steer the particle toward the local and global best solutions. The inertia weight is represented as $w_1$, and $rand_1$ and $rand_2$ are random numbers between 0 and 1.
— Stage 2 

In the next stage, the global best solutions are gathered to form a super swarm. The new super swarm $\delta$ is created from the collection of the global best solutions of the sub-swarms $\epsilon_\chi$:

$$\delta^{(\chi)} \in \delta : \chi = 1, 2, \ldots, X \tag{14}$$
$$\delta^{(\chi)} = g^{(\chi)} \tag{15}$$

In the second-level clustering, the velocity and location of the super swarm are updated according to (16) and (17):

$$v^{(\chi)} \leftarrow w_2 v^{(\chi)} + \alpha_3 rand_3 (b^{(\chi)} - \delta^{(\chi)}) + \alpha_4 rand_4 (g - \delta^{(\chi)}) \tag{16}$$
$$\delta^{(\chi)} \leftarrow \delta^{(\chi)} + v^{(\chi)} \tag{17}$$

where $v^{(\chi)}$ is the velocity associated with $\delta^{(\chi)}$, $b^{(\chi)}$ is the personal best solution, $w_2$ is the inertia weight, and $rand_3$ and $rand_4$ are random values. At this level, the global best is denoted as $g$, and it is not updated unless a better result is found. Determining the optimal values of the parameters is significant for improving the performance of GSO, so the parameters $\alpha_3$ and $\alpha_4$ are adapted. The best solutions are obtained by fuzzy systems. Three membership functions, namely minimum, moderate, and maximum, are used in the Mamdani-type fuzzy system with the iteration as input.
The input variables lie in the range 0 to 1. The iteration variable denotes the fraction of the total number of iterations that has elapsed, and the diversity variable denotes the degree of dispersion of the particles. The output variables lie in the range 0 to 3. The cognitive coefficient, denoted $\alpha_3$, states the significance of the best previous location of the particle. The social coefficient, denoted $\alpha_4$, expresses the significance of the best global location of the swarm. The fuzzy rules are constructed after generating the fuzzy systems. As the iterations progress, the social factor is reduced and the cognitive factor is increased.
- Rulel: if (iteration is minimum) then (@3is minimum) (æ4is maximum) 
- Rule 2: if (iteration is moderate) then (@3 is moderate) (@,4 is moderate) 
- Rule 3: if (iteration is maximum) then (a@3is maximum) (a, is minimum) 
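The three rules and the super-swarm update can be sketched as follows. This is a simplified stand-in rather than a full Mamdani inference: it assumes triangular membership functions over the iteration fraction, representative output values of 0, 1.5, and 3 for the minimum, moderate, and maximum labels on the 0 to 3 output range, and a weighted-average defuzzification. The function names and the `LEVEL` table are hypothetical.

```python
import random


def memberships(fraction):
    """Triangular memberships (minimum, moderate, maximum) of the iteration fraction in [0, 1]."""
    low = max(0.0, 1.0 - fraction / 0.5)
    mid = max(0.0, 1.0 - abs(fraction - 0.5) / 0.5)
    high = max(0.0, (fraction - 0.5) / 0.5)
    return low, mid, high


# Assumption: representative output values for each label on the 0..3 output range.
LEVEL = {"minimum": 0.0, "moderate": 1.5, "maximum": 3.0}


def adapt_coefficients(iteration, max_iterations):
    """Apply rules 1-3: alpha3 rises with the iteration fraction while alpha4 falls."""
    low, mid, high = memberships(iteration / max_iterations)
    total = low + mid + high
    alpha3 = (low * LEVEL["minimum"] + mid * LEVEL["moderate"]
              + high * LEVEL["maximum"]) / total
    alpha4 = (low * LEVEL["maximum"] + mid * LEVEL["moderate"]
              + high * LEVEL["minimum"]) / total
    return alpha3, alpha4


def swarm_step(position, velocity, personal_best, global_best,
               inertia, a_cog, a_soc, rng=random):
    """One velocity and position update of the form used in (16) and (17)."""
    new_velocity = [
        inertia * v + a_cog * rng.random() * (b - x) + a_soc * rng.random() * (g - x)
        for x, v, b, g in zip(position, velocity, personal_best, global_best)
    ]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity
```

In a super-swarm loop, `adapt_coefficients(t, t_max)` would be called once per iteration and its two outputs passed to `swarm_step` as `a_cog` and `a_soc`. The same `swarm_step` form also covers the first-level update with fixed coefficients.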
Finally, accurate answers for complex questions are obtained by the proposed MDCBiLSTM-AGSO framework. The performance of the proposed complex question answering approach is compared with different existing techniques in the following results and discussion section.


4. RESULTS AND DISCUSSION 

In this section, the performance of the proposed method is analyzed and compared with existing schemes. The proposed methodology is implemented on the Java platform. The experimental setup uses an Intel i3 processor with 8 GB RAM. The performance of the proposed scheme is evaluated using several metrics: F-measure, precision, accuracy, and recall. The proposed MDCBiLSTM-AGSO approach is compared with the previous strategies named complex question answering by Japan Advanced Institute of Science and Technology (JAIST), International Committee of the Red Cross (ICRC), AI2 Reasoning Challenge I (A-ARC I), and region based convolutional neural network (RCNN). The performance of the introduced scheme is also compared with two existing works, T-CRNN and HFO-KC. The enhanced performance of the designed strategy demonstrates the efficiency of the technique.

The existing approaches are as follows: RCNN is a general neural network methodology used for modelling deep context in question answering; A-ARC I includes an attentive deep neural network framework for learning the deterministic data for the selection of answers from a common perception [28]; JAIST combines multiple sets of features for the selection of answers in community question answering [29]; ICRC develops a classification methodology for choosing accurate answers in community question answering [30]; the template representation based convolutional recurrent neural network (T-CRNN) selects accurate answers in a CQA framework; and hierarchy based firefly optimized k-means clustering (HFO-KC) chooses the best quality answers from the hierarchy.


4.1. Dataset description 

In this paper, the Factoid Q&A Corpus is used for CQA; it includes 1,714 manually generated factoid questions [31]. The answers to the questions were gathered from the University of Pittsburgh and Carnegie Mellon University between 2008 and 2010. The performance analysis in this paper is discussed in terms of the number of documents and the average values.


4.2. Performance metrics 

The performance of the introduced scheme is evaluated based on the F-measure, precision, accuracy, and recall. Recall (R) and precision (P) are the two fundamental and most frequently used measures in information retrieval. Precision is the fraction of the retrieved documents that are relevant.

$$P = \frac{\text{relevant items retrieved}}{\text{retrieved items}} = P(\text{relevant} \mid \text{retrieved}) \tag{18}$$

Recall is the fraction of the relevant documents that are retrieved.

$$R = \frac{\text{relevant items retrieved}}{\text{relevant items}} = P(\text{retrieved} \mid \text{relevant}) \tag{19}$$

Accuracy is the fraction of classifications that are correct.

$$Accuracy = \frac{tp + tn}{tp + fp + tn + fn} \tag{20}$$

The F-measure is a single measure that trades off precision against recall.

$$F\text{-}measure = \frac{2PR}{P + R} \tag{21}$$
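The four metrics in (18) to (21) translate directly into code once the retrieval outcome is expressed as confusion counts. The sketch below assumes that relevant retrieved items are counted as true positives (tp), irrelevant retrieved items as false positives (fp), relevant items that were missed as false negatives (fn), and correctly rejected items as true negatives (tn). The counts used in the example are hypothetical.

```python
def precision(tp, fp):
    """Eq. (18): relevant items retrieved over all retrieved items."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Eq. (19): relevant items retrieved over all relevant items."""
    return tp / (tp + fn)


def accuracy(tp, fp, tn, fn):
    """Eq. (20): correct decisions over all decisions."""
    return (tp + tn) / (tp + fp + tn + fn)


def f_measure(p, r):
    """Eq. (21): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts, for illustration only.
tp, fp, tn, fn = 90, 10, 80, 20
p, r = precision(tp, fp), recall(tp, fn)
print(p, r, accuracy(tp, fp, tn, fn), f_measure(p, r))
```

With these counts, precision is 90/100, recall is 90/110, accuracy is 170/200, and the F-measure equals 180/210.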


4.3. Performance analysis 

This section presents the performance analysis of the proposed approach against the existing schemes and the existing works in terms of precision, recall, accuracy, and F-measure. It shows that the introduced scheme achieves better outcomes than the implemented schemes and the existing works. Figure 3 shows the precision of the proposed scheme compared with the existing schemes A-ARC I, JAIST, RCNN, and ICRC. As the number of documents is varied over 300, 500, 700, and 1000, the precision of the introduced scheme remains near-optimal. For 300, 500, 700, and 1000 documents, the precision values of the proposed scheme are 0.991, 0.989, 0.985, and 0.981, respectively.

Figure 4 shows the recall performance, which measures the correctly predicted positive observations out of all relevant observations. For 300, 500, 700, and 1000 documents, the recall values of RCNN are 0.56, 0.55, 0.54, and 0.53; those of ICRC are 0.56, 0.5, 0.53, and 0.52; and those of JAIST are 0.57, 0.565, 0.56, and 0.555, respectively. The introduced CQA scheme achieves recall values of 0.9356, 0.91, 0.9, and 0.898.


Figure 3. Performance analysis of precision (x-axis: number of documents)
Figure 4. Performance analysis of recall (x-axis: number of documents)


Figure 5 shows the F-measure comparison of the proposed method with the existing schemes for different numbers of documents. The F-measure is the weighted average of recall and precision. For 300 documents, the proposed scheme achieved 0.964, while the existing schemes RCNN, ICRC, JAIST, and A-ARC I achieved 0.56, 0.57, 0.57, and 0.58, respectively. For 500 documents, they achieved 0.96 and 0.55, 0.555, 0.56, and 0.57, respectively. For 700 documents, they achieved 0.959 and 0.545, 0.5, 0.55, and 0.56, respectively. For 1000 documents, the proposed scheme and RCNN, ICRC, JAIST, and A-ARC I achieved 0.92 and 0.54, 0.45, 0.54, and 0.55, respectively.

The accuracy performance of the designed strategy with existing schemes is in Figure 6. The 
proposed design accuracy is very high compared to the existing schemes. The accuracy value obtained for 
RCNN, ICRC, JAIST, A-ARC I and proposed scheme for 1000 documents are 0.7, 0.65, 0.7, 0.73 and 0.971. 
The accuracy of proposed for 300, 500, and 700 documents are 0.994, 0.988, and 0.979 respectively. 
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Figure 5. Performance analysis of F-measure Figure 6. Performance analysis of accuracy 
Figure 7 shows the average performance of the proposed approach and the existing schemes in terms of 
accuracy, F-measure, precision, and recall. The average precision of the proposed approach is 0.9865, while 
the existing RCNN, ICRC, JAIST, and A-ARC I schemes achieve 0.545, 0.545, 0.5475, and 0.57, respectively. 
The introduced approach therefore improves precision. The number of answers and the recall value are 
indirectly proportional to each other. The average recall of the proposed approach is 0.9336, while the 
existing schemes achieve 0.6, 0.55, 0.5625, and 0.579, respectively. The average F-measure values achieved by 
RCNN, ICRC, JAIST, A-ARC I, and the proposed scheme are 0.548, 0.516, 0.555, 0.565, and 0.9507, 
respectively. The proposed scheme’s F-measure is higher than that of the other schemes. The average 
accuracy values for the existing CQA methods and the proposed scheme are 0.70875, 0.66625, 0.71125, 0.745, 
and 0.983, respectively. 
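
The averages can be checked against the per-size values, assuming each average is the unweighted mean over the four document counts (300, 500, 700, and 1000). The sketch below does this for the accuracy of the proposed scheme using the values reported in the text; the variable and function names are illustrative.

```python
from statistics import mean

# Accuracy of the proposed scheme per document count, as reported in the text.
proposed_accuracy = {300: 0.994, 500: 0.988, 700: 0.979, 1000: 0.971}


def average_over_sizes(per_size: dict) -> float:
    """Unweighted mean over document counts (assumed averaging rule)."""
    return round(mean(per_size.values()), 4)


print(average_over_sizes(proposed_accuracy))  # 0.983
```

Under this assumption the computed mean matches the reported average accuracy of 0.983.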

Figure 8 compares precision with existing works; the introduced scheme achieved the maximum precision in 
the CQA framework. The precision values of the proposed scheme and the existing HFO-KC and T-CRNN 
approaches are 0.991, 0.99, and 0.98 for 300 documents; 0.989, 0.98, and 0.97 for 500 documents; 0.985, 0.96, 
and 0.95 for 700 documents; and 0.981, 0.94, and 0.93 for 1000 documents, respectively. The proposed scheme 
uses a classification approach named Manta-Ray optimized DCBi-LSTM, which produced a better 
classification, and the AGSO algorithm performs the optimal ranking. Hence, the introduced scheme obtained 
the maximum precision rate. The recall comparison of the introduced scheme with existing works is shown in 
Figure 9. The recall values of the proposed scheme and the existing HFO-KC and T-CRNN approaches are 
0.9356, 0.93, and 0.88 for 300 documents; 0.91, 0.9, and 0.86 for 500 documents; 0.90, 0.89, and 0.85 for 700 
documents; and 0.989, 0.87, and 0.8 for 1000 documents, respectively. This shows that the recall of the 
introduced scheme is higher than that of the others. The proposed scheme uses a better similarity 
measurement and classification approach, which together produce better outcomes than the two existing works. 
The F-measure comparison of the introduced scheme with existing works is shown in Figure 10. The 
F-measure values of the proposed scheme and the existing HFO-KC and T-CRNN approaches are 0.964, 0.957, 
and 0.927 for 300 documents; 0.96, 0.941, and 0.911 for 500 documents; 0.959, 0.938, and 0.897 for 700 
documents; and 0.92, 0.90, and 0.86 for 1000 documents, respectively. This shows that the F-measure of the 
introduced scheme is higher than that of the others. The F-measure is the harmonic mean of recall and 
precision. Because the precision and recall of the proposed scheme are higher than those of the two existing 
works, its F-measure is also higher. 

The accuracy comparison of the introduced scheme with existing works is shown in Figure 11, where the 
accuracy of the introduced scheme is higher than that of the others. The proposed classification scheme 
provided better classification accuracy, and the AGSO algorithm produced the optimal answer ranking. 
Hence, the introduced approach produced the highest accuracy of the three schemes. Accuracy is compared 
for 300, 500, 700, and 1000 documents. The proposed scheme achieved 0.994, 0.988, 0.979, and 0.971, 
respectively. The existing HFO-KC work achieved 0.991, 0.982, 0.972, and 0.962, respectively. Similarly, the 
existing T-CRNN work achieved 0.989, 0.98, 0.97, and 0.96, respectively. This shows that the proposed 
methodology attains higher accuracy than the compared approaches. 
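
To make the size of the improvement explicit, the accuracy margin of the proposed scheme over each existing work can be tabulated per document count. The sketch below uses only the accuracy values reported in the text; the dictionary layout and the helper name `margins_over` are illustrative assumptions.

```python
# Accuracy per document count (300, 500, 700, 1000), as reported in the text.
SIZES = [300, 500, 700, 1000]
ACCURACY = {
    "Proposed": [0.994, 0.988, 0.979, 0.971],
    "HFO-KC": [0.991, 0.982, 0.972, 0.962],
    "T-CRNN": [0.989, 0.98, 0.97, 0.96],
}


def margins_over(baseline: str) -> list:
    """Accuracy of the proposed scheme minus the baseline, per document count."""
    return [
        round(proposed - existing, 3)
        for proposed, existing in zip(ACCURACY["Proposed"], ACCURACY[baseline])
    ]


for name in ("HFO-KC", "T-CRNN"):
    print(name, dict(zip(SIZES, margins_over(name))))
```

On these figures, the margin over both existing works grows as the number of documents increases.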

5. CONCLUSION 

This paper presented a manta ray optimized MDCBiLSTM-AGSO approach for effective answer selection in 
a CQA system. Initially, the input query is decomposed into several questions. Using a knowledge base, the 
proposed scheme replaces each entity of the question with a template. After that, the sentence matrix of each 
question and answer pair is generated and represented in vector form using pre-trained word embeddings. 
Next, the new InfoSelectivity similarity approach is used to obtain the semantic matching pattern between 
question and answer. Then, the score for each answer is computed. The proper answers are classified by the 
proposed MDCBiLSTM neural network. Finally, AGSO outputs the best answer. The proposed scheme is 
implemented on the JAVA platform, and its performance is evaluated with different performance measures 
such as F-measure, accuracy, precision, and recall. The proposed approach achieved better outcomes than the 
existing schemes. In future work, hybridized approaches will be used to further enhance the presented 
approach. Moreover, effective features for training on accurate answers will be a focus. 